TreeGNG - hierarchical topological clustering

Authors

  • Kevin Doherty
  • Rod Adams
  • Neil Davey
Abstract

This paper presents TreeGNG, a top-down unsupervised learning method that produces hierarchical classification schemes. TreeGNG extends the Growing Neural Gas algorithm by maintaining a time history of the learned topological mapping. TreeGNG is able to recover from poor decisions made during the construction of the tree, and provides the novel ability to influence the general shape of the hierarchy.

1 Background

In this paper we present TreeGNG, a new hierarchical clustering algorithm based on a time audit trail of the graph connectivity of the Growing Neural Gas (GNG) [4] algorithm. GNG grows and prunes the network components in response to incremental learning. The dynamic nature of the network can produce disjoint graph structures, and these disconnected graphs “can be used to identify clusters in the input data” [5]. By maintaining a time audit trail of the graph connectivity, we can uncover hierarchical structure within the data.

Work on using unsupervised learning techniques to discover a hierarchy of concepts from unlabelled data has focused primarily on extensions to Kohonen’s Self-Organising Feature Map (SOM) [8]. Recently, two hierarchical neural clusterers, HiGCS [1] and TreeGCS [6], have been proposed, both based on Bernd Fritzke’s Growing Cell Structures (GCS) [3]. However, GCS has some limitations, which are detailed in [7] and [1], and the results of our own evaluation of GCS concur with these findings.

The models produced by the GCS and SOM algorithms are a topological mapping only if the topology of the graph matches the topological structure of the manifold from which the data are drawn. In [9], it was shown that an algorithm for the homogeneous distribution of the network nodes, combined with the competitive Hebb rule, produces the induced Delaunay triangulation of the manifold.

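The competitive Hebb rule referred to above connects, for each input signal, the two nodes closest to that signal. The following is a minimal Python sketch under stated assumptions: node positions are fixed, distance is Euclidean, and the function name `competitive_hebbian_edges` is hypothetical. It is not the authors' implementation, and it leaves out the node adaptation, edge ageing, and node insertion that a full GNG performs.

```python
import math


def competitive_hebbian_edges(nodes, samples):
    """Return the undirected edges created by the competitive Hebb rule.

    nodes   -- list of coordinate tuples (fixed positions, an assumption)
    samples -- iterable of coordinate tuples drawn from the data
    """
    edges = set()
    for x in samples:
        # Rank node indices by Euclidean distance to the input signal.
        ranked = sorted(range(len(nodes)), key=lambda i: math.dist(nodes[i], x))
        first, second = ranked[0], ranked[1]
        # Connect the nearest and second-nearest nodes (stored as a sorted pair).
        edges.add((min(first, second), max(first, second)))
    return edges


if __name__ == "__main__":
    nodes = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (6.0, 0.0)]
    samples = [(0.4, 0.0), (5.6, 0.0)]
    print(sorted(competitive_hebbian_edges(nodes, samples)))
```

With two well-separated groups of nodes and samples drawn only near each group, no edge bridges the gap, which is how disjoint graph structures can arise.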
GNG is an incremental implementation of this node distribution and competitive Hebbian learning concept, and the resultant graph structures are a topology-preserving map of the manifold. It was reported in [2] that GNG converges rapidly and that the quality of the clustering has little dependence on the network parameters. The results of our own evaluation of GNG agree with these findings.

2 TreeGNG

Inspired by the claimed capabilities of TreeGCS, we investigated its performance. Whilst we believe the TreeGCS event auditing approach has potential, the resultant trees are binary, and we remain unconvinced as to the suitability of GCS as the underlying algorithm. In TreeGNG (Fig. 1), we use GNG as the partitioning algorithm, and whilst the tree growth and pruning mechanisms follow the general outline of TreeGCS, we believe the inclusion of a growth window can improve the quality of the tree representation of data, as the resultant TreeGNG trees are not necessarily binary.
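The idea of auditing graph connectivity over time can be illustrated with a small Python sketch. It is an illustration under assumptions, not the TreeGNG procedure itself: the node set is taken to be fixed across snapshots, the growth window and tree pruning are not modelled, and the names `connected_components` and `record_splits` are hypothetical.

```python
def connected_components(node_ids, edges):
    """Return the connected components of an undirected graph as frozensets."""
    adjacency = {n: set() for n in node_ids}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen = set()
    components = []
    for start in sorted(node_ids):
        if start in seen:
            continue
        # Depth-first traversal from an unvisited node.
        stack, component = [start], set()
        while stack:
            n = stack.pop()
            if n in component:
                continue
            component.add(n)
            stack.extend(adjacency[n] - component)
        seen |= component
        components.append(frozenset(component))
    return components


def record_splits(snapshots):
    """Map each component that splits to the components it splits into.

    snapshots -- list of (node_ids, edges) pairs in time order
    """
    tree = {}
    previous = connected_components(*snapshots[0])
    for node_ids, edges in snapshots[1:]:
        current = connected_components(node_ids, edges)
        for parent in previous:
            children = [c for c in current if c <= parent]
            if len(children) > 1:  # the parent graph has become disjoint
                tree[parent] = children
        previous = current
    return tree


if __name__ == "__main__":
    ids = set(range(6))
    chain = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
    pruned = [(0, 1), (2, 3), (4, 5)]
    for parent, children in record_splits([(ids, chain), (ids, pruned)]).items():
        print(sorted(parent), "->", [sorted(c) for c in children])
```

In this toy run a single connected graph breaks into three parts between two snapshots, so the recorded parent has three children, which is one way a non-binary tree node could come about.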


Similar resources

Hierarchical Growing Neural Gas

This paper describes TreeGNG, a top-down unsupervised learning method that produces hierarchical classification schemes. TreeGNG is an extension to the Growing Neural Gas algorithm that maintains a time history of the learned topological mapping. TreeGNG is able to correct poor decisions made during the early phases of the construction of the tree, and provides the novel ability to influence th...

A New Hierarchical Clustering Method using Topological Map

We present a new hierarchical clustering criteria which can be applied to data set. This is done after generating an initial partition by using a Topological Self Organizing Map. This criteria contains two terms which take into account two different errors simultaneously: the square error of the entire clustering (as the Ward criteria) and the topological structure given by the Self Organizing M...

Data Communications Through Large Packet Switching Networks

The topological design and adaptive routing procedure for computer networks becomes infeasible under their present form as the number of network nodes grows. In this paper we present, optimize and evaluate hierarchical procedures to be used in the case of large networks. These procedures are an extension of present schemes and rely on a hierarchical clustering of the network nodes. Models are d...

Extracting Knowledge from the Geometric Shape of Social Network Data Using Topological Data Analysis

Topological data analysis is a noble approach to extract meaningful information from high-dimensional data and is robust to noise. It is based on topology, which aims to study the geometric shape of data. In order to apply topological data analysis, an algorithm called mapper is adopted. The output from mapper is a simplicial complex that represents a set of connected clusters of data points. I...

Hierarchical characterization of complex networks

While the majority of approaches to the characterization of complex networks has relied on measurements considering only the immediate neighborhood of each network node, valuable information about the network topological properties can be obtained by considering further neighborhoods. The current work discusses on how the concepts of hierarchical node degree and hierarchical clustering coeffici...


Publication date: 2005